Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parallelize x.py test tidy #81833

Merged
merged 3 commits into from
Feb 21, 2021
Merged

Conversation

the8472
Copy link
Member

@the8472 the8472 commented Feb 6, 2021

Running tidy on individual commits when rewriting git history was somewhat of an annoyance, so I have parallelized it a bit.

running time ./x.py test tidy with warm IO caches:

old:

real	0m11.123s
user	0m14.495s
sys	0m5.227s

new:

real	0m1.834s
user	0m13.545s
sys	0m3.094s

There's further room for improvement (<0.9s should be feasible) but that would require bigger changes.

@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 6, 2021
@jyn514 jyn514 added A-contributor-roadblock Area: Makes things more difficult for new contributors to rust itself A-testsuite Area: The testsuite used to check the correctness of rustc labels Feb 6, 2021
@bors
Copy link
Contributor

bors commented Feb 10, 2021

☔ The latest upstream changes (presumably #81768) made this pull request unmergeable. Please resolve the merge conflicts.

@the8472
Copy link
Member Author

the8472 commented Feb 11, 2021

rebased and switched from rayon to crossbeam-utils.

src/bootstrap/format.rs Show resolved Hide resolved
src/bootstrap/format.rs Show resolved Hide resolved
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ~5x win you list is nice, and it doesn't look like this adds too much complexity, so it's overall not too bad... I am inclined to merge, but would like to spend a bit of time thinking about it.

let mut temp_path = path.join("tidy-test-file");
//
// We also append the thread ID to avoid threads trampling on each others files.
let file_name = format!("t{}.tidy-test-file", std::thread::current().id().as_u64());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies we're visiting the same path more than once, which seems surprising to me - why is this necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this isn't actually needed. I did this change before the one in filter_dirs. I'll retest with only the filter_dirs change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thread '' panicked at 'Deleted temp file: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/tools/tidy/src/bins.rs:43:46

Looks like adding the thread ID was necessary after all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not satisfied with just an error message - something is wrong with at least my mental model of the code, and that leads me to believe this may just be hiding some other bug. I would like to see an explanation for why the error is happening.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possibility is that the first create fails and then the 2nd attempt uses the output dir which is shared between the passes.

match fs::File::create(&temp_path).or_else(|_| {
temp_path = output.join("tidy-test-file");
fs::File::create(&temp_path)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not parallelize this pass perhaps to avoid this? I'm not convinced by your explanation I guess and it feels subtle; I'd rather have slow correct code than wrong parallel code :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but I don't quite understand the concern. The tests are largely non-mutating. This is the only mutating part and the fix I did would be necessary for correctness anyway due to to the possibility mentioned above.

So what do you fear might be going wrong beyond this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modified comment in the code doesn't suggest a correctness fix to me, so maybe I overlooked that - if there's a correctness fix here, I'm not yet seeing how we've avoided the problem by way of thread ID... does this problem occur before your PR? If so, then I don't see how adding thread ID helps, as previously it'd always be zero, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was trying to say is that it is a necessary part of making these checks concurrent because the code path creating the files may create them in either the source or the output folder and the output folder is shared. So if two checks run concurrently then they might create the file twice and try to delete it twice which will fail if it's not a unique file name.

Your concern seems to be that it's not sufficient, but I don't see anything else that might trigger this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW (I guess we're moving it to a separate PR) I think I'd suggest we move this out of the parallel code or cache the results; they shouldn't change and doing it just once seems simpler than trying to make sure we don't clobber each other. It also helps us avoid accidentally getting different results in different threads, which could be quite confusing.

@Mark-Simulacrum
Copy link
Member

Also, one thing we'll want to do is respect -j when passed to x.py, for both tidy and fmt checking. Ideally we'd also utilize the jobserver crate to properly handle parallelism when running under a larger make, but that seems complicated.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 13, 2021
@the8472
Copy link
Member Author

the8472 commented Feb 13, 2021

Ideally we'd also utilize the jobserver crate to properly handle parallelism when running under a larger make, but that seems complicated.

Bootstrap currently doesn't provide a jobserver, right? Only cargo does afaict.

Getting the testsuite to run under a jobserver would be nice since it could unlock some additional room for parallelism - e.g. building stage 0 compiletest or rustdoc concurrent with the stage 1 compile - but that seems like a much bigger task overall.

@the8472
Copy link
Member Author

the8472 commented Feb 13, 2021

-j1 makes things slow again.

@rustbot label: +S-waiting-on-review -S-waiting-on-author

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 13, 2021
@rust-log-analyzer

This comment has been minimized.

@Mark-Simulacrum
Copy link
Member

Local measurement suggests that 70% of the tidy runtime comes from rustfmt, so it might be good to just skip the parallelization of the tidy part (avoiding complexity there), and retain just the rustfmt parallelism; ignore is already walking the tree in parallel today so there's not much added complexity there.

Does that make sense to you?

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 20, 2021
@the8472
Copy link
Member Author

the8472 commented Feb 20, 2021

I'd prefer if the runtime were even lower, rather than higher. But if that's the only way to get this merged... what choice do I have?

I would prefer to understand your concerns though.

@the8472
Copy link
Member Author

the8472 commented Feb 20, 2021

I'll split the tidy changes into a separate PR so they can be discussed on their own.

@Mark-Simulacrum
Copy link
Member

Well, there is a cost to the parallelism being added to tidy - crossbeam is not the lightest of dependencies, and I'd expect a bug report down the line that we're not handling jobserver tokens right (when -j1 is passed to make, rather than x.py directly).

Ultimately I think we can agree that there's a cost to the added parallelism; reduced but not eliminated by the safety guarantees Rust provides. I'm not sure that justifying that tradeoff for a few seconds of time is worth it, but it's also true that I don't really run tidy locally at all; others' workflows may involve calling it more often, and that may mean that they care more about this. It's hard for me to accept added code and complexity for something that at least for me is primarily a CI cost.

The rustfmt parallelism is presumably where the majority of the wins this PR brings come from, and so I'm more willing to accept those - they're also in theory less impactful from a maintenance perspective, as they're entirely local (i.e., we just call rustfmt with more processes at once). I wouldn't be surprised if we could get similar wins without any parallelism, just batching ~15-20 files to rustfmt at a time, but maybe my calculus is off there. Finding a way to avoid --skip-children might be even more rewarding

@the8472
Copy link
Member Author

the8472 commented Feb 20, 2021

others' workflows may involve calling it more often, and that may mean that they care more about this.

Yeah, I run it frequently when doing interactive git rebase. I was also thinking of using it as a commit hook but it's still too slow for that.

I wouldn't be surprised if we could get similar wins without any parallelism, just batching ~15-20 files to rustfmt at a time

It's already batching (N = 8). I measured performance win from batch sizes in this discussion and larger batches provide barely any speedups. I also ran this under perf, most of the wins are from parallelism.

Well, there is a cost to the parallelism being added to tidy - crossbeam is not the lightest of dependencies

I can write this without crossbeam if you want, 1 thread per check, up to -j threads or something. Anyway... we can discuss it in a separate PR soon.

and I'd expect a bug report down the line that we're not handling jobserver tokens right (when -j1 is passed to make, rather than x.py directly).

Eh, this only runs for a few seconds, so it shouldn't screw with anyone's build system. I mean jobserver support for all of bootstrap is a great idea, but by no means trivial, at least if we want to fully utilize it e.g. by parallelizing build phases and ordering them before test phases. But I don't see it as a killer feature in this particular case.

There are much bigger fish not obeying jobserver tokens, e.g. compiletest which runs for minutes.

old:

```
real	0m11.123s
user	0m14.495s
sys	0m5.227s
```

new:

```
real	0m2.767s
user	0m13.014s
sys	0m1.691s
```
@the8472
Copy link
Member Author

the8472 commented Feb 20, 2021

Updated timings with just the rustfmt changes

real	0m2.847s
user	0m13.466s
sys	0m1.782s

@Mark-Simulacrum
Copy link
Member

@bors r+ rollup

Ok, thanks for splitting things out. tidy and rustfmt are pretty orthogonal anyway so it seems good to land them separately. I think I'm still interested in seeing the tidy changes - see my last comment on the one direct problem I wanted to remove before landing those changes.

@bors
Copy link
Contributor

bors commented Feb 20, 2021

📌 Commit c071970 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 20, 2021
@the8472 the8472 mentioned this pull request Feb 20, 2021
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Feb 21, 2021
… r=Mark-Simulacrum

parallelize x.py test tidy

Running tidy on individual commits when rewriting git history was somewhat of an annoyance, so I have parallelized it a bit.

running `time ./x.py test tidy` with warm IO caches:

old:

```
real	0m11.123s
user	0m14.495s
sys	0m5.227s
```

new:

```
real	0m1.834s
user	0m13.545s
sys	0m3.094s
```

There's further room for improvement (<0.9s should be feasible) but that would require bigger changes.
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Feb 21, 2021
… r=Mark-Simulacrum

parallelize x.py test tidy

Running tidy on individual commits when rewriting git history was somewhat of an annoyance, so I have parallelized it a bit.

running `time ./x.py test tidy` with warm IO caches:

old:

```
real	0m11.123s
user	0m14.495s
sys	0m5.227s
```

new:

```
real	0m1.834s
user	0m13.545s
sys	0m3.094s
```

There's further room for improvement (<0.9s should be feasible) but that would require bigger changes.
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 21, 2021
Rollup of 11 pull requests

Successful merges:

 - rust-lang#81300 (BTree: share panicky test code & test panic during clear, clone)
 - rust-lang#81706 (Document BinaryHeap unsafe functions)
 - rust-lang#81833 (parallelize x.py test tidy)
 - rust-lang#81966 (Add new `rustc` target for Arm64 machines that can target the iphonesimulator)
 - rust-lang#82154 (Update RELEASES.md 1.50 to include methods stabilized in rust-lang#79342)
 - rust-lang#82177 (Do not delete bootstrap.exe on Windows during clean)
 - rust-lang#82181 (Add check for ES5 in CI)
 - rust-lang#82229 (Add [A-diagnostics] bug report template)
 - rust-lang#82233 (try-back-block-type test: Use TryFromSliceError for From test)
 - rust-lang#82302 (Remove unsafe impl Send for CompletedTest & TestResult)
 - rust-lang#82349 (test: Print test name only once on timeout)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 4c1f195 into rust-lang:master Feb 21, 2021
@rustbot rustbot added this to the 1.52.0 milestone Feb 21, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 4, 2021
…lacrum

Parallelize tidy

Split off from rust-lang#81833

While that PR brings wall time of `x.py test tidy` down to 0m2.847s adding this one on top should bring it down to 0m1.673s.

r? `@Mark-Simulacrum`

Previous concerns can be found at rust-lang#81833 (comment) and rust-lang#81833 (comment)
@jyn514 jyn514 mentioned this pull request Dec 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-contributor-roadblock Area: Makes things more difficult for new contributors to rust itself A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants